-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid SmallVec::collect
#64949
Avoid SmallVec::collect
#64949
Conversation
This commit reduces instruction counts for several benchmarks by up to 5%.
This commit reduces instruction counts for several benchmarks by up to 5%.
Also avoid interning when it's not necessary. This commit reduces instruction counts for a couple of benchmarks by up to 1%.
r? @zackmdavis (rust_highfive has picked a reviewer for you, use r? to override) |
Some local perf results:
@bors try @rust-timer queue |
Awaiting bors try build completion |
Avoid `SmallVec::collect` We can get sizeable speed-ups by avoiding `SmallVec::collect` when the number of elements is small.
// This code is hot enough that it's worth specializing for the most | ||
// common length lists, to avoid the overhead of `SmallVec` creation. | ||
// The match arms are in order of frequency. The 1, 2, and 0 cases are | ||
// typically hit in ~95% of cases. We assume that if the upper and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this assumption OK? It seems to run counter to the Iterator
docs -- In particular, are you sure that no unsafe
code in the compiler relies on the correctness of collecting? You could specialize for TrustedLen
in which case this assumption is definitely OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First I tried adding blanket TrustedLen
requirements, but some of the iterators that feed into this method don't implement TrustedLen
.
Then I tried to use specialization but failed. The problem is that we have this function:
fn intern_with<I: Iterator<Item=Self>, F: FnOnce(&[T]) -> R>(iter: I, f: F)
And we want one behaviour if I
implements TrustedLen
, and another behaviour if it doesn't. But you can't do that with an argument. This function appears within an impl
block for Result<T, E>
, i.e. I
isn't part of the impl
type.
As for the existing code... if the iterator gives fewer elements than the size hint (e.g. 1 when the hint was (2, Some(2))
) then an unwrap
will safely fail within the function. If the iterator gives more elements than the size hint (e.g. 2 when the hint was (1, Some(1))
) then the wrong value will be interned. This could certainly cause correctness issues. It seems unlikely it could lead to unsafety, though I can't guarantee it. Also, it seems unlikely that an iterator would erroneously claim an exact size hint (i.e. where the lower bound and the upper bound match).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't say that reasoning based on probability makes me very confident in the safety of this change. :/
How bad/expensive would it be to assert!
that calling next()
again yields None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any unsafe code here, and size_hint
is required to produce conservative lower and upper bounds. If the bounds are the same, we know the size exactly.
☀️ Try build successful - checks-azure |
f(&[]) | ||
} | ||
_ => { | ||
f(&iter.collect::<Result<SmallVec<[_; 8]>, _>>()?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the small cases are already taken care of above, could it be a win to just collect to Vec here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't measured, but I suspect Vec
would be slower because it always requires an allocation. The SmallVec
only requires an allocation if the length exceeds 8, which is extremely rare.
Also, this collect-into-SmallVec
idiom is very common within this module, so I think it's good for consistency reasons, too.
Thanks! 💖 @bors r+ |
📌 Commit d1a7bb3 has been approved by |
Avoid `SmallVec::collect` We can get sizeable speed-ups by avoiding `SmallVec::collect` when the number of elements is small.
☀️ Test successful - checks-azure |
I didn't do a CI perf run before landing, but here are the post-landing result. Lots of wins, up to 7.1%. |
We can get sizeable speed-ups by avoiding
SmallVec::collect
when the number of elements is small.